(57) Abstract

Resource requests made by clients of origin servers in a network are intercepted by reflector mechanisms and selectively reflected to 
other servers called repeaters. The reflectors select a best repeater from a set of possible repeaters and redirect the client to the selected 
best repeater. The client then makes the request of the selected best repeater. The resource is possibly rewritten to replace at least some of 
the resource identifiers contained therein with modified resource identifiers designating the repeater instead of the origin server. 
Optimized Network Resource Location

1. Field of the Invention 

This invention relates to replication of resources in computer networks. 

2. Background of the Invention 

The advent of global computer networks, such as the Internet, has led to
entirely new and different ways to obtain information. A user of the Internet can now
access information from anywhere in the world, with no regard for the actual location of 
either the user or the information. A user can obtain information simply by knowing a 
network address for the information and providing that address to an appropriate 
application program such as a network browser. 

The rapid growth in popularity of the Internet has imposed a heavy traffic 
burden on the entire network. Solutions to problems of demand (e.g., better 
accessibility and faster communication links) only increase the strain on the supply. 
Internet Web sites (referred to here as "publishers") must handle ever-increasing 
bandwidth needs, accommodate dynamic changes in load, and improve performance for 
distant browsing clients, especially those overseas. The adoption of content-rich 
applications, such as live audio and video, has further exacerbated the problem. 

To address basic bandwidth growth needs, a Web publisher typically subscribes 
to additional bandwidth from an Internet service provider (ISP), whether in the form of 
larger or additional "pipes" or channels from the ISP to the publisher's premises, or in 
the form of large bandwidth commitments in an ISP's remote hosting server collection. 
These increments are not always as fine-grained as the publisher needs, and quite often 
lead times can cause the publisher's Web site capacity to lag behind demand. 

To address more serious bandwidth growth problems, publishers may develop 
more complex and costly custom solutions. The solution to the most common need, 
increasing capacity, is generally based on replication of hardware resources and site 
content (known as mirroring), and duplication of bandwidth resources. These solutions,
however, are difficult and expensive to deploy and operate. As a result, only the largest
publishers can afford them, since only those publishers can amortize the costs over
many customers (and Web site hits).

A number of solutions have been developed to advance replication and
mirroring. In general, these technologies are designed for use by a single Web site and
do not include features that allow their components to be shared by many Web sites
simultaneously.

Some solution mechanisms offer replication software that helps keep mirrored
servers up-to-date. These mechanisms generally operate by making a complete copy of a
file system. One such system operates by transparently keeping multiple copies of a file
system in synch. Another system provides mechanisms for explicitly and regularly
copying files that have changed. Database systems are particularly difficult to replicate,
as they are continually changing. Several mechanisms allow for replication of databases,
although there are no standard approaches for accomplishing it. Several companies
offering proxy caches describe them as replication tools. However, proxy caches differ
because they are operated on behalf of clients rather than publishers.

Once a Web site is served by multiple servers, a challenge is to ensure that the
load is appropriately distributed or balanced among those servers. Domain name-server-based
round-robin address resolution causes different clients to be directed to different
mirrors.

Another solution, load balancing, takes into account the load at each server
(measured in a variety of ways) to select which server should handle a particular request.
Load balancers use a variety of techniques to route the request to the appropriate
server. Most of those load-balancing techniques require that each server be an exact
replica of the primary Web site. Load balancers do not take into account the "network
distance" between the client and candidate mirror servers.
Assuming that client protocols cannot easily change, there are two major
problems in the deployment of replicated resources. The first is how to select which 
copy of the resource to use. That is, when a request for a resource is made to a single 
server, how should the choice of a replica of the server (or of that data) be made. We 
call this problem the "rendezvous problem". There are a number of ways to get clients 
to rendezvous at distant mirror servers. These technologies, like load balancers, must 
route a request to an appropriate server, but unlike load balancers, they take network 
performance and topology into account in making the determination.

A number of companies offer products which improve network performance by 
prioritizing and filtering network traffic. 

Proxy caches provide a way for client aggregators to reduce network resource 
consumption by storing copies of popular resources close to the end users. A client 
aggregator is an Internet service provider or other organization that brings a large 
number of clients operating browsers to the Internet. Client aggregators may use proxy 
caches to reduce the bandwidth required to serve web content to these browsers. 
However, traditional proxy caches are operated on behalf of Web clients rather than 
Web publishers. 

Proxy caches store the most popular resources from all publishers, which means 
they must be very large to achieve reasonable cache efficiency. (The efficiency of a 
cache is defined as the number of requests for resources which are already cached 
divided by the total number of requests.) 

Proxy caches depend on cache control hints delivered with resources to 
determine when the resources should be replaced. These hints are predictive, and are 
necessarily often incorrect, so proxy caches frequently serve stale data. In many cases, 
proxy cache operators instruct their proxy to ignore hints in order to make the cache 
more efficient, even though this causes it to more frequently serve stale data.

Proxy caches hide the activity of clients from publishers. Once a resource is 
cached, the publisher has no way of knowing how often it was accessed from the cache. 
SUMMARY OF THE INVENTION

This invention provides a way for servers in a computer network to off-load
their processing of requests for selected resources by determining a different server (a
"repeater") to process those requests. The selection of the repeater can be made
dynamically, based on information about possible repeaters.

If a requested resource contains references to other resources, some or all of 
these references can be replaced by references to repeaters. 

Accordingly, in one aspect, this invention is a method of processing resource
requests in a computer network. First a client makes a request for a particular resource
from an origin server, the request including a resource identifier for the particular
resource, the resource identifier sometimes including an indication of the origin server.
Requests arriving at the origin server do not always include an indication of the origin
server; since they are sent to the origin server, they do not need to name it. A
mechanism referred to as a reflector, co-located with the origin server, intercepts the
request from the client to the origin server and decides whether to reflect the request or
to handle it locally. If the reflector decides to handle the request locally, it forwards it
to the origin server; otherwise it selects a "best" repeater to process the request. If the
request is reflected, the client is provided with a modified resource identifier designating
the repeater.

The client gets the modified resource identifier from the reflector and makes a
request for the particular resource from the repeater designated in the modified resource
identifier.

When the repeater gets the client's request, it responds by returning the
requested resource to the client. If the repeater has a local copy of the resource then it
returns that copy; otherwise it forwards the request to the origin server to get the
resource, and saves a local copy of the resource in order to serve subsequent requests.

The selection by the reflector of an appropriate repeater to handle the request
can be done in a number of ways. In the preferred embodiment, it is done by first
pre-partitioning the network into "cost groups" and then determining which cost group the
client is in. Next, from a plurality of repeaters in the network, a set of repeaters is 
selected, the members of the set having a low cost relative to the cost group which the 
client is in. In order to determine the lowest cost, a table is maintained and regularly 
updated to define the cost between each group and each repeater. Then one member of 
the set is selected, preferably randomly, as the best repeater. 
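
The selection procedure just described can be illustrated with a minimal Python sketch. The network prefixes, group names, repeater names, cost values and the `tolerance` used to decide what counts as "low cost" are all hypothetical; the text above specifies only that a regularly updated cost table is consulted and that one member of the low-cost set is preferably chosen at random.

```python
import ipaddress
import random

# Hypothetical partition of the network into cost groups (by address prefix).
NETWORK_GROUPS = [
    (ipaddress.ip_network("192.0.2.0/24"), "group-east"),
    (ipaddress.ip_network("198.51.100.0/24"), "group-west"),
]

# Hypothetical cost table: cost between each group and each repeater.
COST_TABLE = {
    "group-east": {"repeater-a": 1.0, "repeater-b": 1.2, "repeater-c": 9.0},
    "group-west": {"repeater-a": 8.0, "repeater-b": 7.5, "repeater-c": 1.1},
}


def cost_group_for(client_ip):
    """Return the cost group containing client_ip, or None if unknown."""
    addr = ipaddress.ip_address(client_ip)
    for network, group in NETWORK_GROUPS:
        if addr in network:
            return group
    return None


def low_cost_set(cost_table, group, tolerance=0.5):
    """Return the repeaters whose cost is within `tolerance` of the lowest."""
    costs = cost_table[group]
    best = min(costs.values())
    return sorted(name for name, cost in costs.items() if cost <= best + tolerance)


def select_repeater(cost_table, group, rng=random):
    """Pick one member of the low-cost set at random as the best repeater."""
    return rng.choice(low_cost_set(cost_table, group))
```

Choosing randomly among several near-equal candidates, rather than always taking the single cheapest one, spreads load across repeaters that are about equally close to the client.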

If the particular requested resource itself can contain identifiers of other 
resources, then the resource may be rewritten (before being provided to the client). In 
particular, the resource is rewritten to replace at least some of the resource identifiers
contained therein with modified resource identifiers designating a repeater instead of the
origin server. As a consequence of this rewriting process, when the client requests other
resources based on identifiers in the particular requested resource, the client will make 
those requests directly to the selected repeater, bypassing the reflector and origin server 
entirely. 

Resource rewriting must be performed by reflectors. It may also be performed 
by repeaters, in the situation where repeaters "peer" with one another and make copies 
of resources which include rewritten resource identifiers that designate a repeater. 

In a preferred embodiment, the network is the Internet and the resource 
identifier is a uniform resource locator (URL) for designating resources on the Internet, 
and the modified resource identifier is a URL designating the repeater and indicating the 
origin server (as described in step B3 below), and the modified resource identifier is 
provided to the client using a REDIRECT message. Note, only when the reflector is 
"reflecting" a request is the modified resource identifier provided using a REDIRECT 
message. 

In another aspect, this invention is a computer network comprising a plurality of 
origin servers, at least some of the origin servers having reflectors associated therewith, 
and a plurality of repeaters. 
Brief Description of the Drawings

The above and other objects and advantages of the invention will be apparent
upon consideration of the following detailed description, taken in conjunction with the
accompanying drawings, in which FIGURE 1 depicts a portion of a network environment
according to the present invention; and the remaining FIGURES are flow charts of the
operation of the present invention.



DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EXEMPLARY EMBODIMENTS

Overview

FIGURE 1 shows a portion of a network environment 100 according to the
present invention, wherein a mechanism (reflector 108) at a server (herein origin
server 102) maintains and keeps track of a number of repeaters 104a, 104b, etc. Each
repeater replicates some or all of the information available on the origin server 102.
Reflector 108 is connected to a particular repeater known as its "contact" repeater
("Repeater B" 104b in the system depicted in FIGURE 1). Preferably each reflector
maintains a connection with a single repeater known as its contact, and each repeater
maintains a connection with a special repeater known as its master repeater (e.g.,
repeater 104m for repeaters 104a, 104b and 104c in FIGURE 1).

Thus, a repeater maintains a partial or sparse mirror of the origin server 102 by
implementing a coherent cache of its resources. Repeaters
mirror selected resources provided by origin servers in response to clients' HTTP
(hypertext transfer protocol) and FTP (file transfer protocol) requests. 

A client 106 connects, via the network 100, to origin server 102 and possibly to 
one or more repeaters 104a etc. 

Origin server 102 is a server at which resources originate. More generally, the 
origin server 102 is any process or collection of processes that provide resources in 
response to requests from a client 106. Origin server 102 can be any off-the-shelf Web
server. In a preferred embodiment, origin server 102 is typically a Web server such as 
the Apache server or Netscape Communications Corporation's Enterprise™ server. 

Client 106 is a processor requesting resources from origin server 102 on behalf of 
an end user. The client 106 is typically a user agent (e.g., a Web browser such as 
Netscape Communications Corporation's Navigator™) or a proxy for a user agent. 
Components other than the reflector 108 and the repeaters 104a, 104b, etc., may be 
implemented using commonly available software programs. In particular, this invention 
works with any HTTP client (e.g., a Web browser), proxy cache, and Web server. In 
addition, the reflector 108 might be fully integrated into the data server 112 (for instance, 
in a Web Server). These components might be loosely integrated based on the use of 
extension mechanisms (such as so-called add-in modules) or tightly integrated by 
modifying the service component specifically to support the repeaters. 

Resources originating at the origin server 102 may be static or dynamic. That is,
the resources may be fixed or they may be created by the origin server 102 specifically in 
response to a request. Note that the terms "static" and "dynamic" are relative, since a 
static resource may change at some regular, albeit long, interval. 

Resource requests from the client 106 to the origin server 102 are intercepted by 
reflector 108 which for a given request either forwards the request on to the origin server 
102 or conditionally reflects it to some repeater 104a, 104b, etc. in the network 100. 
That is, depending on the nature of the request by the client 106 to the origin server 102, 
the reflector 108 either serves the request locally (at the origin server 102), or selects one 
of the repeaters 104a, 104b, etc. and reflects the request to the
selected repeater. In other words, the reflector 108 causes requests for resources from
origin server 102, made by client 106, to be either served locally by the
origin server 102 or reflected to one of the repeaters 104a, 104b, etc.

Repeaters 104a, 104b, etc. are intermediate processors used to service client
requests, thereby improving performance and reducing costs in the manner described
herein. Repeaters 104a, 104b, etc. are any processes or collections of processes
that deliver resources to the client 106 on behalf of the origin server 102. A repeater
may include a repeater cache 110, used to avoid unnecessary transactions with the origin
server 102.

The reflector 108 is a mechanism, preferably a software program, that intercepts
requests that would normally be sent directly to the origin server 102. While shown in
the drawings as separate components, the reflector 108 and the origin server 102 are
typically co-located, e.g., on a particular system such as data server 112. (As discussed
below, the reflector 108 may even be a "plug in" module that becomes part of the origin
server 102.)

FIGURE 1 shows only a part of a network 100 according to this invention. A
complete network may include any number of clients, repeaters, reflectors and
origin servers. Reflectors communicate with the repeater network, and repeaters in the
network communicate with one another. 
Uniform Resource Locators 

Each location in a computer network has an address which can generally be
specified as a series of names or numbers. In order to access information, an address for
that information must be known. For example, on the World Wide Web ("the Web"),
which is a subset of the Internet, the manner in which information address locations are
provided has been standardized into Uniform Resource Locators (URLs). URLs specify
the location of resources (information, data files, etc.) on the network. 

The notion of URLs becomes even more useful when hypertext documents are 
used. A hypertext document is one which includes, within the document itself, links 
(pointers or references) to the document itself or to other documents. For example, in 
an on-line legal research system, each case may be presented as a hypertext document. 
When other cases are cited, links to those cases can be provided. In this way, when a 
person is reading a case, they can follow cite links to read the appropriate parts of cited
cases.



In the case of the Internet in general and the World Wide Web specifically, 
documents can be created using a standardized form known as the Hypertext Markup 
Language (HTML). In HTML, a document consists of data (text, images, sounds, and 
the like), including links to other sections of the same document or to other documents. 
The links are generally provided as URLs, and can be in relative or absolute form. 
Relative URLs simply omit the parts of the URL which are the same as for the 
document including the link, such as the address of the document (when linking to the
same document), etc. In general, a browser program will fill in missing parts of a URL 
using the corresponding parts from the current document, thereby forming a fully 
formed URL including a fully qualified domain name, etc. 

A hypertext document may contain any number of links to other documents, 
and each of those other documents may be on a different server in a different part of the 
world. For example, a document may contain links to documents in Russia, Africa, 
China and Australia. A user viewing that document at a particular client can follow any 
of the links transparently (i.e., without knowing where the document being linked to
actually resides). Accordingly, the cost (in terms of time or money or resource
allocation) of following one link versus another may be quite significant. 

URLs generally have the following form (defined in detail in T. Berners-Lee et al.,
Uniform Resource Locators (URL), Network Working Group, Request for Comments: 1738,
Category: Standards Track, December 1994, located at
"http://ds.internic.net/rfc/rfc1738.txt", which is hereby incorporated herein by
reference):

scheme://host[:port]/url-path

where "scheme" can be a symbol such as "file" (for a file on the local system), "ftp" (for a
file on an anonymous FTP file server), "http" (for a file on a Web server), and
"telnet" (for a connection to a Telnet-based service). Other schemes can also be used,
and new schemes are added every now and then. The port number is optional, the
system substituting a default port number (depending on the scheme) if none is
provided. The "host" field maps to a particular network address for a particular
computer. The "url-path" is relative to the computer specified in the "host" field.

For example, the following is a URL identifying a file "F" in the path "A/B/C"
on a computer at "www.uspto.gov":

http://www.uspto.gov/A/B/C/F

In order to access the file "F" (the resource) specified by the above URL, a
program (e.g., a browser) running on a user's computer (i.e., a client computer) would
have to first locate the computer (i.e., a server computer) specified by the host name.
That is, the program would have to locate the server "www.uspto.gov". To do this, it would
access a Domain Name Server (DNS), providing the DNS with the host name
("www.uspto.gov"). The DNS acts as a kind of centralized directory for resolving
addresses from names. If the DNS determines that there is a (remote server) computer
corresponding to the name "www.uspto.gov", it will provide the program with an actual
computer network address for that server computer. On the Internet this is called an
Internet Protocol (or IP) address and it has the form "123.345.456.678". The program
on the user's (client) computer would then use the actual address to access the remote
(server) computer.
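
As an illustration of the URL form discussed above, the following Python sketch splits a URL into its scheme, host, port and url-path using the standard `urllib.parse` module. The `DEFAULT_PORTS` table and the `split_url` helper are assumptions made for this example and are not part of the described system.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when the URL carries no explicit port number.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23}


def split_url(url):
    """Return (scheme, host, port, url_path) for a URL of the form
    scheme://host[:port]/url-path."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path.lstrip("/")
```

For the example URL above, `split_url("http://www.uspto.gov/A/B/C/F")` yields the scheme "http", the host "www.uspto.gov", the default port 80 and the url-path "A/B/C/F". A client would then resolve the host name to a network address, for instance with `socket.getaddrinfo(host, port)`.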
The program opens a connection to the HTTP server (Web server) on the
remote computer "www.uspto.gov" and uses the connection to send a request message to
the remote computer (using the HTTP scheme). The message is typically an HTTP
GET request which includes the url-path of the requested resource, "A/B/C/F". The
HTTP server receives the request and uses it to access the resource specified by the
url-path "A/B/C/F". The server returns the resource over the same connection.

Thus, conventionally HTTP client requests for Web resources at an origin server
102 are processed as follows (see FIGURE 2). (This is a description of the process when
no reflector 108 is installed.):

A1. A browser (e.g., Netscape's Navigator) at the client receives a resource
identifier (i.e., a URL) from a user.

A2. The browser extracts the host (origin server) name from the resource
identifier, and uses a domain name server (DNS) to look up the network
(IP) address of the corresponding server. The browser also extracts a
port number, if one is present, or uses a default port number (the default
port number for http requests is 80).

A3. The browser uses the server's network address and port number to
establish a connection between the client 106 and the host or origin
server 102.

A4. The client 106 then sends a (GET) request over the connection
identifying the requested resource.

A5. The origin server 102 receives the request and

A6. locates or composes the corresponding resource.



WO 99/40514 



PCT/US99/01477 






A7. The origin server 102 then sends back to the client 106 a reply containing
the requested resource (or some form of error indicator if the resource is
unavailable). The reply is sent to the client over the same connection as
that on which the request was received from the client.

A8. The client 106 receives the reply from the origin server 102.



There are many variations of this basic model. For example, in one variation,
instead of providing the client with the resource, the origin server can tell the client to
re-request the resource by another name. To do so, in A7 the server 102 sends back to
the client 106 a reply called a "REDIRECT" which contains a new URL indicating the
other name. The client 106 then repeats the entire sequence, normally without any user
intervention, this time requesting the resource identified by the new URL.

System Operation 

In this invention reflector 108 effectively takes the place of an ordinary Web
server or origin server 102. The reflector 108 does this by taking over the origin server's
IP address and port number. In this way, when a client tries to connect to the origin
server 102, it will actually connect to the reflector 108. The original Web server (or
origin server 102) must then accept requests at a different network (IP) address, or at the
same IP address but on a different port number. Thus, using this invention, the server
referred to in A3-A7 above is actually a reflector 108.

Note that it is also possible to leave the origin server's network address as it is
and to let the reflector run at a different address or on a different port. In this way the
reflector does not intercept requests sent to the origin server, but can still be sent
requests addressed specifically to the reflector. Thus the system can be tested and
configured without interrupting its normal operation.
The reflector 108 supports the processing as follows (see FIGURE 3):
upon receipt of a request,

B1. The reflector 108 analyzes the request to determine whether or not to
reflect the request. To do this, first the reflector determines whether the
sender (client 106) is a browser or a repeater. Requests issued by 
repeaters must be served locally by the origin server 102. This 
determination can be made by looking up the network (IP) address of 
the sender in a list of known repeater network (IP) addresses. 
Alternatively, this determination could be made by attaching information 
to a request to indicate that the request is from a specific repeater, or 
repeaters can request resources from a special port other than the one 
used for ordinary clients. 

B2. If the request is not from a repeater, the reflector looks up the requested
resource in a table (called the "rule base") to determine whether the
resource requested is "repeatable". Based on this determination, the 
reflector either reflects the request (B3, described below) or serves the 
request locally (B4, described below). 

The rule base is a list of regular expressions and associated 
attributes. (Regular expressions are well-known in the field of computer 
science. A small bibliography of their use is found in Aho, et al.,
"Compilers: Principles, Techniques, and Tools", Addison-Wesley, 1986,
pp. 157-158.) The resource identifier (URL) for a given request is looked
up in the rule base by matching it sequentially with each regular 
expression. The first match identifies the attributes for the resource, 
namely repeatable or local. If there is no match in the rule base, a default 
attribute is used. Each reflector has its own rule base, which is manually 
configured by the reflector operator. 
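
Steps B1 and B2 can be sketched in Python as follows. The repeater addresses, the regular expressions, the default attribute and the function names are all hypothetical, and for simplicity the sketch matches rules against the url-path only; the text requires only an ordered list of regular expressions whose first match supplies the attribute.

```python
import re

# Hypothetical list of known repeater network (IP) addresses (step B1).
KNOWN_REPEATERS = {"192.0.2.10", "192.0.2.11"}

# Hypothetical rule base: ordered (regular expression, attribute) pairs (step B2).
RULE_BASE = [
    (re.compile(r"^/cgi-bin/"), "local"),
    (re.compile(r"\.(gif|jpg|html)$"), "repeatable"),
]
DEFAULT_ATTRIBUTE = "local"


def lookup_attribute(url_path, rule_base=RULE_BASE, default=DEFAULT_ATTRIBUTE):
    """Match sequentially; the first matching rule supplies the attribute."""
    for pattern, attribute in rule_base:
        if pattern.search(url_path):
            return attribute
    return default


def should_reflect(sender_ip, url_path):
    """Requests from repeaters are always served locally; other requests are
    reflected only if the rule base marks the resource as repeatable."""
    if sender_ip in KNOWN_REPEATERS:
        return False
    return lookup_attribute(url_path) == "repeatable"
```

Because matching is sequential, rule order matters: in this sketch "/cgi-bin/search.html" is classified as local even though it also matches the second pattern.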
B3. To reflect a request (to serve a request locally, go to B4),
as shown in FIGURE 4, the reflector determines (B3-1) the best repeater
to reflect the request to, as described in detail below. The reflector then
creates (B3-2) a new resource identifier (URL) (using the requested URL
and the best repeater) that identifies the same resource at the selected
repeater.

It is necessary that the reflection step create a single URL
containing the URL of the original resource, as well as the identity of the
selected repeater. A special form of URL is created to provide this
information. This is done by creating a new URL as follows:

D1. Given a repeater name, scheme, origin server name and path, create a
new URL. If the scheme is "http", the preferred embodiment uses the
following format:

http://<repeater>/<server>/<path>

If the form used is other than "http", the preferred embodiment uses the
following format:

http://<repeater>/<server>@proxy=<scheme>@/<path>

The reflector can also attach a MIME type to the request, to cause the
repeater to provide that MIME type with the result. This is useful
because many protocols (such as FTP) do not provide a way to attach a
MIME type to a resource. The format is 

This URL is interpreted when received by the repeater. 
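
The construction of the reflected URL in step B3-2 might be sketched as follows in Python. The sketch assumes that the two formats read `http://<repeater>/<server>/<path>` for the "http" scheme and `http://<repeater>/<server>@proxy=<scheme>@/<path>` otherwise; the function name and the host names in the usage note are hypothetical, and the MIME-type variant is omitted.

```python
def make_reflected_url(repeater, scheme, server, path):
    """Build a single URL naming both the selected repeater and the
    original resource (origin server, scheme and path)."""
    path = path.lstrip("/")
    if scheme == "http":
        return f"http://{repeater}/{server}/{path}"
    # Assumed format for schemes other than http.
    return f"http://{repeater}/{server}@proxy={scheme}@/{path}"
```

For example, `make_reflected_url("repeater1.example.net", "http", "www.example.com", "/A/B/C/F")` gives `http://repeater1.example.net/www.example.com/A/B/C/F`, which the client can use unchanged when it retries the request.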
The reflector then sends (B3-3) a REDIRECT reply containing
this new URL to the requesting client. The HTTP REDIRECT 
command allows the reflector to send the browser a single URL to retry 
the request. 

B4. To serve a request locally, the request is sent by the reflector to 

("forwarded to") the origin server 102. In this mode, the reflector acts as 
a reverse proxy server. The origin server 102 processes the request in the 
normal manner (A5-A7). The reflector then obtains the origin server's 
reply to the request which it inspects to determine if the requested 
resource is an HTML document, i.e., whether the requested resource is 
one which itself contains resource identifiers. 

B5. If the resource is an HTML document then the reflector rewrites the 

HTML document by modifying resource identifiers (URLs) within it, as 
described below. The resource, possibly as modified by rewriting, is then 
returned in a reply to the requesting client 106. 

If the requesting client is a repeater, the reflector may temporarily 
disable any cache-control modifiers which the origin server attached to 
the reply. These disabled cache-control modifiers are later re-enabled 
when the content is served from the repeater. This mechanism makes it 
possible for the origin server to prevent resources from being cached at 
normal proxy caches, without affecting the behavior of the cache at the 
repeater. 

B6. Whether the request is reflected or handled locally, details about the 
transaction, such as the current time, the address of the requester, the 
URL requested, and the type of response generated, are written by the 
reflector to a local log file.

By using a rule base (B2), it is possible to selectively reflect resources. There are 
a number of reasons that certain particular resources cannot be effectively repeated (and 
therefore should not be reflected), for instance: 

the resource is composed uniquely for each request;

the resource relies on a so-called cookie (browsers will not send cookies
to repeaters with different domain names);

the resource is actually a program (such as a Java applet) that will run on
the client and that wishes to connect to a service (Java requires that the
service be running on the same machine that provided the applet).

In addition, the reflector 108 can be configured so that requests from certain 
network addresses (e.g., requests from clients on the same local area network as the 
reflector itself) are never reflected. Also, the reflector may choose not to reflect requests 
because the reflector is exceeding its committed aggregate information rate, as described
below.

A request which is reflected is automatically mirrored at the repeater when the 
repeater receives and processes the request.

The combination of the reflection process described here and the caching
process described below effectively creates a system in which repeatable resources are
migrated to and mirrored at the selected repeater, while non-repeatable resources are
not mirrored.

Alternate Approach 

Placing the origin server name in the reflected URL is generally a good strategy,
but it may be considered undesirable for aesthetic or (in the case, e.g., of cookies) certain
technical reasons.

It is possible to avoid the need for placing both the repeater name and the server 
name in the URL. Instead, a "family" of names may be created for a given origin server, 
each name identifying one of the repeaters used by that server. 
For instance, if www.example.com is the origin server, names for three repeaters
might be created:

wr1.example.com
wr2.example.com
wr3.example.com

The name "wr1.example.com" would be an alias for repeater 1, which might also
be known by other names such as "wr1.anotherExample.com" and "wr1.example.edu".

If the repeater can determine by which name it was addressed, it can use this
information (along with a table that associates repeater alias names with origin server
names) to determine which origin server is being addressed. For instance, if repeater 1 is
addressed as "wr1.example.com", then the origin server is "www.example.com"; if it is
addressed as "wr1.anotherExample.com", then the origin server is
"www.anotherExample.com".

The repeater can use two mechanisms to determine by which alias it is
addressed:

1. Each alias can be associated with a different IP address. Unfortunately, 
this solution does not scale well, as IP addresses are currently scarce, and 
the number of IP addresses required grows as the product of origin 
servers and repeaters. 

2. The repeater can attempt to determine the alias name used by inspecting 
the "host:" tag in the HTTP header of the request. Unfortunately, some
old browsers still in use do not attach the "host:" tag to a request. 
Reflectors would need to identify such browsers (the browser identity is 
a part of each request) and avoid this form of reflection. 
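
The second mechanism, combined with the alias table described above, could look like the following Python sketch. The table contents follow the example names in the text; the `headers` dictionary keyed by "Host" and the function name are assumptions of this illustration.

```python
# Table associating repeater alias names with origin server names
# (entries taken from the example names used in the text).
ALIAS_TO_ORIGIN = {
    "wr1.example.com": "www.example.com",
    "wr1.anotherexample.com": "www.anotherExample.com",
}


def origin_for_request(headers):
    """Return the origin server implied by the alias in the "host:" tag,
    or None if the tag is absent or the alias is unknown."""
    host = headers.get("Host")
    if host is None:
        return None  # e.g. an old browser that does not send the tag
    alias = host.split(":", 1)[0].strip().lower()  # drop any port, ignore case
    return ALIAS_TO_ORIGIN.get(alias)
```

A `None` result corresponds to the case noted in item 2, in which the repeater cannot tell by which alias it was addressed.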

How a Repeater Handles a Request 

When a browser receives a REDIRECT response (as produced in B3), it reissues
a request for the resource using the new resource identifier (URL) (A1-A5). Because the
new identifier refers to a repeater instead of the origin server, the browser now sends a
request for the resource to the repeater which processes a request as follows, with
reference to FIGURE 5:

C1. First the repeater analyzes the request to determine the network address
of the requesting client and the path of the resource requested. Included
in the path is an origin server name (as described above with reference to 
B3). 

C2. The repeater uses an internal table to verify that the origin server belongs 
to a known "subscriber". A subscriber is an entity (e.g., a company) that 
publishes resources (e.g., files) via one or more origin servers. When the 
entity subscribes, it is permitted to utilize the repeater network. The 
subscriber tables described below include the information that is used to
link reflectors to subscribers. 

If the request is not for a resource from a known subscriber, the
request is rejected. To reject a request, the repeater returns a reply
indicating that the requested resource does not exist.

C3. The repeater then determines whether the requested resource is cached
locally. If the requested resource is in the repeater's cache, it is retrieved.
On the other hand, if a valid copy of the requested resource is not in the
repeater's cache, the repeater modifies the incoming URL, creating a
request that it issues directly to the originating reflector which processes
it (as in B1-B6). Because this request to the originating reflector is from
a repeater, the reflector always returns the requested resource rather than
reflecting the request. (Recall that reflectors always handle requests from
repeaters locally.) If the repeater obtained the resource from the origin
server, the repeater then caches the resource locally.

If a resource is not cached locally, the cache can query its "peer
caches" to see if one of them contains the resource, before or at the
same time as requesting the resource from the reflector/origin server. If
a peer cache responds positively in a limited period of time (preferably a
small fraction of a second), the resource will be retrieved from the peer
cache.

C4. The repeater then constructs a reply including the requested resource
(which was retrieved from the cache or from the origin server) and sends
that reply to the requesting client.

C5. Details about the transaction, such as the associated reflector, the current
time, the address of the requester, the URL requested, and the type of
response generated, are written to a local log file at the repeater.

Note that the bottom row of FIGURE 2 refers to an origin server, or a reflector, 
or a repeater, depending on what the URL in step A1 identifies.
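
Steps C1 to C5 can be summarized in a short sketch. It assumes the request has already been parsed (C1), omits peer caches, and uses an HTTP 404 status to stand for the "does not exist" reply; the function and parameter names are illustrative only.

```python
import time

def handle_request(client_addr, origin, path, subscribers, cache, fetch, log):
    """Sketch of steps C2-C5 for an already-parsed request (C1)."""
    if origin not in subscribers:          # C2: not a known subscriber
        status, body = 404, b""            # assumed form of the "does not exist" reply
    else:
        key = (origin, path)
        if key in cache:                   # C3: valid copy in the local cache
            body = cache[key]
        else:                              # C3: fill the cache from the reflector
            body = fetch(origin, path)
            cache[key] = body
        status = 200                       # C4: reply carries the resource
    # C5: record time, requester, resource and response type.
    log.append((time.time(), client_addr, origin, path, status))
    return status, body
```

Here `fetch` stands for the request that the repeater issues to the originating reflector, which always serves requests from repeaters locally.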

Selecting the Best Repeater 

If the reflector 108 determines that it will reflect the request, it must then select 
the best repeater to handle that request (as referred to in step B3-1). This selection is 
performed by the Best Repeater Selector (BRS) mechanism described here. 

The goal of the BRS is to select, quickly and heuristically, an appropriate repeater
for a given client given only the network address of the client. An appropriate repeater
is one which is not too heavily loaded and which is not too far from the client in terms
of some measure of network distance. The mechanism used here relies on specific,
compact, pre-computed data to make a fast decision. Other, dynamic solutions can also
be used to select an appropriate repeater.

The BRS relies on three pre-computed tables, namely the Group Reduction
Table, the Link Cost Table, and the Load Table. These three tables (described below)
are computed off-line and downloaded to each reflector by its contact in the repeater
network.

The Group Reduction Table places every network address into a group, with
the goal that addresses in a group share relative costs, so that they would have the same
best repeater under varying conditions (i.e., the BRS is invariant over the members of
the group).

The Link Cost Table is a two-dimensional matrix which specifies the current
cost between each repeater and each group. Initially, the link cost between a repeater
and a group is defined as the "normalized link cost" between the repeater and the group,
as defined below. Over time, the table will be updated with measurements which more
accurately reflect the relative cost of transmitting a file between the repeater and a
member of the group. The format of the Link Cost Table is <Group ID> <Group
ID> <link cost>, where the Group IDs are given as AS numbers.

The Load Table is a one dimensional table which identifies the current load at 
each repeater. Because repeaters may have different capacities, the load is a value that 
represents the ability of a given repeater to accept additional work. Each repeater sends 
its current load to a central master repeater at regular intervals, preferably at least 
approximately once a minute. The master repeater broadcasts the Load Table to each 
reflector in the network, via the contact repeater. 

A reflector is provided entries in the Load Table only for repeaters which it is 
assigned to use. The assignment of repeaters to reflectors is performed centrally by a 
repeater network operator at the master repeater. This assignment makes it possible to 
modify the service level of a given reflector. For instance, a very active reflector may use 
many repeaters, whereas a relatively inactive reflector may use few repeaters.

Tables may also be configured to provide selective repeater service to subscribers 
in other ways, e.g., for their clients in specific geographic regions, such as Europe or 
Asia. 

Measuring Load 

In the presently preferred embodiments, repeater load is measured in two
dimensions, namely

1. requests received by the repeater per time interval (RRPT), and

2. bytes sent by the repeater per time interval (BSPT).

For each of these dimensions, a maximum capacity setting is set. The maximum 
capacity indicates the point at which the repeater is considered to be fully loaded. A 
higher RRPT capacity generally indicates a faster processor, whereas a higher BSPT 
capacity generally indicates a wider network pipe. This form of load measurement 
assumes that a given server is dedicated to the task of repeating. 

Each repeater regularly calculates its current RRPT and BSPT, by accumulating 
the number of requests received and bytes sent over a short time interval. These 
measurements are used to determine the repeater's load in each of these dimensions. If 
a repeater's load exceeds its configured capacity, an alarm message is sent to the repeater 
network administrator. 

The two current load components are combined into a single value indicating 
overall current load. Similarly, the two maximum capacity components are combined 
into a single value indicating overall maximum capacity. The components are combined 
as follows: 

current-load = B × current RRPT + (1 - B) × current BSPT
max-load = B × max RRPT + (1 - B) × max BSPT

The factor B, a value between 0 and 1, allows the relative weights of RRPT and
BSPT to be adjusted, which favors consideration of either processing power or
bandwidth.

The overall current load and overall maximum capacity values are periodically
sent from each repeater to the master repeater, where they are aggregated in the Load
Table, a table summarizing the overall load for all repeaters. Changes in the Load Table
are distributed automatically to each reflector.

While the preferred embodiment uses a two-dimensional measure of repeater
load, any other measure of load can be used.
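
The weighted combination above can be written out directly. The sample figures and the value of B below are assumptions chosen only to illustrate the arithmetic.

```python
def combined_load(rrpt, bspt, b):
    """Weighted combination B * RRPT + (1 - B) * BSPT, with 0 <= B <= 1."""
    if not 0.0 <= b <= 1.0:
        raise ValueError("B must lie between 0 and 1")
    return b * rrpt + (1.0 - b) * bspt

B = 0.5  # hypothetical weighting: equal emphasis on requests and bytes

# Hypothetical measurements for one repeater over one time interval.
current_load = combined_load(rrpt=200, bspt=3_000_000, b=B)
max_load = combined_load(rrpt=1_000, bspt=10_000_000, b=B)
```

Because RRPT and BSPT are expressed in different units, the choice of B also determines how strongly each unit contributes to the combined value.
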
Combining Link Costs and Load

The BRS computes the cost of servicing a given client from each eligible
repeater. The cost is computed by combining the available capacity of the candidate
repeater with the link cost between that repeater and the client. The link cost is
computed by simply looking it up in the Link Cost Table.

The cost is determined using the following formula: 



threshold = K * max-load
capacity = max(max-load - current-load, e)
capacity = min(capacity, threshold)
cost = link-cost * threshold / capacity



In this formula, e is a very small number (epsilon) and K is a tuning factor initially
set to 0.5. This formula causes the cost to a given repeater to be increased, at a rate
defined by K, if its capacity falls below a configurable threshold.
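
As a worked illustration of the formula, the following sketch evaluates the cost for a lightly loaded and a heavily loaded repeater. The value chosen for epsilon and the sample loads are assumptions.

```python
EPSILON = 1e-9  # the "very small number" e; its exact value is an assumption

def combined_cost(link_cost, current_load, max_load, k=0.5):
    """Cost of serving a client from one repeater, per the formula above."""
    threshold = k * max_load
    capacity = max(max_load - current_load, EPSILON)
    capacity = min(capacity, threshold)
    return link_cost * threshold / capacity

# With max-load 100 and K = 0.5 the threshold is 50.
light = combined_cost(link_cost=10, current_load=20, max_load=100)  # capacity capped at 50
heavy = combined_cost(link_cost=10, current_load=75, max_load=100)  # capacity 25
```

While the spare capacity is at or above the threshold the cost equals the link cost; once it falls below the threshold the cost grows in inverse proportion to the remaining capacity.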

Given the cost of each candidate repeater, the BRS selects all repeaters within a
delta factor of the best score. From this set, the result is selected at random.

The delta factor prevents the BRS from repeatedly selecting a single repeater
when scores are similar. It is generally required because available information about load
and link costs loses accuracy over time. This factor is tunable.

Best Repeater Selector (BRS)

The BRS operates as follows, with reference to FIGURE 6: 

Given a client network address and the three tables described above: 

E1. Determine which group the client is in using the Group Reduction
Table.

E2. For each repeater in the Link Cost Table and Load Table, determine that
repeater's combined cost as follows:

E2a. Determine the maximum and current load on the repeater (using
the Load Table).

E2b. Determine the link cost between the repeater and the client's
group (using the Link Cost Table).

E2c. Determine the combined cost as described above.

E3. Select a small set of repeaters with the lowest cost.

E4. Select a random member from the set.

Preferably the results of the BRS processing are maintained in a local cache at
the reflector 108. Thus, if the best repeater has recently been determined for a given
client (i.e., for a given network address), that best repeater can be reused quickly without
being re-determined. Since the calculation described above is based on static,
pre-computed tables, if the tables have not changed then there is no need to re-determine
the best repeater.
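
Steps E1 to E4, together with the local cache, might be sketched as follows. The table representations, the interpretation of the delta factor as a relative margin over the best cost, and all names are assumptions made for the example; a real cache would also have to be cleared whenever a table changes.

```python
import random

def select_best_repeater(client_addr, group_of, link_cost, load,
                         delta=0.1, k=0.5, cache=None, rng=random):
    """Sketch of E1-E4. `group_of(addr)` stands in for the Group Reduction
    Table, `link_cost[(repeater, group)]` for the Link Cost Table, and
    `load[repeater] = (current, maximum)` for the Load Table."""
    if cache is not None and client_addr in cache:
        return cache[client_addr]
    group = group_of(client_addr)                                  # E1
    costs = {}
    for repeater, (current, maximum) in load.items():              # E2
        threshold = k * maximum
        capacity = min(max(maximum - current, 1e-9), threshold)
        costs[repeater] = link_cost[(repeater, group)] * threshold / capacity
    best = min(costs.values())
    # E3: every repeater whose cost is within the delta margin of the best.
    candidates = sorted(r for r, c in costs.items() if c <= best * (1 + delta))
    choice = rng.choice(candidates)                                # E4
    if cache is not None:
        cache[client_addr] = choice
    return choice
```
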
Determining the Group Reduction and Link Cost Tables

The Group Reduction Table and Link Cost Table are produced by a procedure
referred to here as NetMap. The NetMap procedure is run by executing several phases
(described below).

The term Group is used here to refer to an IP "address group".

The term Repeater Group refers to a Group that contains the IP address of a
repeater.

The term link cost refers to a statically determined cost for transmitting data
between two Groups. In a presently preferred implementation, this is the minimum of
the sums of the costs of the links along each path between them. The link costs of
concern here are link costs between a Group and a Repeater Group.

[...] any Repeater Group from each of its link costs to a Repeater Group.

The term Cost Set refers to a set of Groups that are equivalent in regard to
Best Repeater Selection; that is, the same repeater would be selected for any of them.

The NetMap procedure first processes input files to create an internal database
called the Group Registry. These input files describe groups, the IP addresses within
groups, and links between groups, and come from a variety of sources, including
[...] databases, BGP router tables, and probe [...]. The
Group Registry contains essential information used for further processing, namely (1)
the identity of each group, (2) the set of IP addresses in a given group, (3) the presence
of links between groups indicating paths over which information may travel, and (4) the
cost of sending data over a given link.

The following processes are then performed on the Group Registry file. 

Calculate Repeater Group link costs 

The NetMap procedure calculates a "link cost" for transmission of data between 
each Repeater Group and each Group in the Group Registry. This overall link cost is 
defined as the minimum cost of any path between the two groups, where the cost of a 
path is equal to the sum of the costs of the individual links in the path. The link cost 
algorithm presented below is essentially the same as algorithm #562 from ACM journal 
Transactions on Mathematical Software: "Shortest Path From a Specific Node to All 
Other Nodes in a Network" by U. Pape, ACM TOMS 6 (1980) pp. 450-455, 
http://www.netlib.org/toms/562.

In this processing, the term Repeater Group refers to a Group that contains the
IP address of a repeater. A group is a neighbor of another group if the Group Registry
indicates that there is a link between the two groups.

For each target Repeater Group T: 

• Initialize the link cost between T and itself to zero. 

• Initialize the link cost between T and every other Group to infinity. 

• Create a list L that will contain Groups that are equidistant from the target 
Repeater Group T. 

• Initialize the list L to contain just the target Repeater Group T itself. 

• While the list L is not empty:

    • Create an empty list L' of neighbors of members of the list L.

    • For each Group G in the list L:

        • For each Group N that is a neighbor of G:

            • Let cost refer to the sum of the link cost between T and
              G, and the link cost between G and N. The cost between T
              and G was determined in the previous pass of the algorithm;
              the link cost between G and N is from the Group Registry.

            • If cost is less than the link cost between T and N:

                • Set the link cost between T and N to cost.

                • Add N to L' if it is not already on it.

    • Set L to L'.
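
The procedure above can be sketched for a single target Repeater Group as follows. The dictionary representation of the Group Registry links is an assumption, and the sketch presumes non-negative link costs so that the loop terminates.

```python
import math

def link_costs_from(target, neighbors):
    """Minimum path cost from `target` to every Group.
    `neighbors[g]` maps each neighbor of Group g to the cost of that link;
    every Group must appear as a key of `neighbors`."""
    cost = {g: math.inf for g in neighbors}
    cost[target] = 0
    current = [target]                      # the list L
    while current:
        following = []                      # the list L'
        for g in current:
            for n, link in neighbors[g].items():
                candidate = cost[g] + link
                if candidate < cost[n]:
                    cost[n] = candidate
                    if n not in following:
                        following.append(n)
        current = following                 # set L to L'
    return cost

# Hypothetical three-group registry: the direct A-C link is more expensive
# than the path through B.
links = {"A": {"B": 1, "C": 5}, "B": {"A": 1, "C": 1}, "C": {"A": 5, "B": 1}}
costs_from_a = link_costs_from("A", links)
```

Running the function once per Repeater Group yields one column of link costs for each repeater.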



Calculate Cost Sets 

A Cost Set is a set of Groups that are equivalent with respect to Best Repeater
Selection; that is, the same repeater would be selected for any of them.

The "cost profile" of a Group G is defined herein as the set of costs between G
and each Repeater. Two cost profiles are said to be equivalent if the values in one
profile differ from the corresponding values in the other profile by a constant amount.

Once a client Group is known, the Best Repeater Selection algorithm relies on
the cost profile for information about the Group. If two cost profiles are equivalent, the
BRS algorithm would select the same repeater given either profile.

A Cost Set is then a set of groups that have equivalent cost profiles.

The effectiveness of this method can be seen, for example, in the case where all
paths to a Repeater from some Group A pass through some other Group B. The two
Groups have equivalent cost profiles (and are therefore in the same Cost Set) since
whatever Repeater is best for Group A is also going to be best for Group B, regardless
of what path is taken between the two Groups.

By normalizing cost profiles, equivalent cost profiles can be made identical. A
normalized cost profile is a cost profile in which the minimum cost has the value zero.
A normalized cost profile is computed by finding the minimum cost in the profile, and
subtracting that value from each cost in the profile.

Cost Sets are then computed using the following algorithm:

• For each Group G: 

• Calculate the normalized cost profile for G 

• Look for a Cost Set with the same normalized cost profile. 

• If such a set is found, add G to the existing Cost Set;

• otherwise, create a new Cost Set with the calculated normalized cost profile, 
containing only G. 

The algorithm for finding Cost Sets employs a hash table to reduce the time
necessary to determine whether the desired Cost Set already exists. The hash table uses
a hash value computed from the cost profile of G.

Each Cost Set is then numbered with a unique Cost Set Index number. Cost
Sets are then used in a straightforward manner to generate the Link Cost Table, which
gives the cost from each Cost Set to each Repeater.

As described below, the Group Reduction Table maps every IP address to one 
of these Cost Sets. 
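
A minimal sketch of the Cost Set computation follows, using a Python dictionary as the hash table. The profile representation (a tuple of integer costs, one per Repeater in a fixed order) and the sample values are assumptions.

```python
def build_cost_sets(profiles):
    """Group together Groups whose normalized cost profiles are identical.
    Returns a list indexed by Cost Set Index of (normalized profile, members)."""
    index_of = {}    # normalized profile -> Cost Set Index (the hash table)
    cost_sets = []
    for group, profile in profiles.items():
        low = min(profile)
        normalized = tuple(c - low for c in profile)
        if normalized not in index_of:
            index_of[normalized] = len(cost_sets)
            cost_sets.append((normalized, []))
        cost_sets[index_of[normalized]][1].append(group)
    return cost_sets

# Hypothetical profiles: A and B differ by a constant (2), so they are equivalent.
profiles = {"A": (3, 5, 9), "B": (1, 3, 7), "C": (2, 2, 2)}
cost_sets = build_cost_sets(profiles)
```

The normalized profile stored with each Cost Set is exactly the row that the Link Cost Table needs for that set.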

Build IP Map 

The IP Map is a sorted list of records which map IP address ranges to Link Cost
Table keys. The format of the IP map is:

<base IP address> <max IP address> <Link Cost Table key>

where IP addresses are presently represented by 32-bit integers. The entries are sorted by
descending base address, and by ascending maximum address among equal base
addresses, and by ascending Link Cost Table key among equal base addresses and
maximum addresses. Note that ranges may overlap.

The NetMap procedure generates an intermediate IP map containing a map 
between IP address ranges and Cost Set numbers as follows: 
• For each Cost Set S:

    • For each Group G in S:

        • For each IP address range in G:

            • Add a triple (low address, high address, Cost Set number of
              S) to the IP map.

The IP map file is then sorted by descending base address, and by ascending
maximum address among equal base addresses, and by ascending Cost Set number
among equal base addresses and maximum addresses. This sort order minimizes the
time to build the Group Reduction Table and produces the proper results for
overlapping entries.

Finally, the NetMap procedure creates the Group Reduction Table by processing
the sorted IP map. The Group Reduction Table maps IP addresses (specified by ranges)
to Cost Set numbers. Special processing of the IP map file is required in order to
detect overlapping address ranges, and to merge adjacent address ranges in order to
minimize the size of the Group Reduction Table.

An ordered list of address range segments is maintained, each segment consisting
of a base address B and a Cost Set number N, sorted by base address B. (The maximum
address of a segment is the base address of the next segment minus one.)
The following algorithm is used: 

• Initialize the list with the elements [-infinity, NOGROUP], [+infinity, NOGROUP].

• For each entry in the IP map, in sorted order, consisting of (b, m, s):

    • Insert (b, m, s) in the list (recall that IP map entries are of the form
      (low address, high address, Cost Set number of S)).

• For each reserved LAN address range (b, m):

    • Insert (b, m, LOCAL) in the list.

• For each Repeater at address a:

    • Insert (a, a, REPEATER) in the list.

• For each segment S in the ordered list:

    • Merge S with following segments with the same Cost Set.

    • Create a Group Reduction Table entry with

        • base address = base address of S,

        • max address = next segment's base - 1,

        • group ID = Cost Set number of S.

A reserved LAN address range is an address range reserved for use by LANs 
which should not appear as a global Internet address. LOCAL is a special Cost Set 
index different from all others, indicating that the range maps to a client which should 
never be reflected. REPEATER is a special Cost Set index different from all others, 
indicating that the address range maps to a repeater. NOGROUP is a special Cost Set 
index different from all others, indicating that this range of addresses has no known 
mapping. 
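
Once built, the Group Reduction Table can be searched by binary search on the base addresses. The sketch below assumes a table of (base address, max address, group ID) entries sorted by ascending base address; the sample ranges and the group ID 7 are hypothetical.

```python
import bisect
import ipaddress

def ip(text):
    """Convert dotted-quad text to the 32-bit integer form used by the table."""
    return int(ipaddress.IPv4Address(text))

# Hypothetical Group Reduction Table, sorted by base address.
TABLE = [
    (ip("10.0.0.0"), ip("10.255.255.255"), "LOCAL"),
    (ip("192.0.2.0"), ip("192.0.2.255"), 7),
]
BASES = [entry[0] for entry in TABLE]

def group_for(address):
    """Return the group ID whose range contains `address`, else NOGROUP."""
    value = ip(address)
    i = bisect.bisect_right(BASES, value) - 1   # last entry with base <= value
    if i >= 0 and value <= TABLE[i][1]:
        return TABLE[i][2]
    return "NOGROUP"
```

Because overlapping ranges were resolved when the table was built, a single probe suffices for each lookup.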

Given (B, M, N), insert an entry in the ordered address list as follows:

• Find the last segment (AB, AN) for which AB is less than or equal to B.

• If AB is less than B, insert a new segment (B, N) after (AB, AN).

• Find the last segment (YB, YN) for which YB is less than or equal to M.

• Replace by (XB, N) any segment (XB, NOGROUP) for which XB is greater
  than B and less than YB.

• If YN is not N, and either YN is NOGROUP or YB is less than or equal to B:

    • Let (ZB, ZN) be the segment following (YB, YN).

    • If M+1 is less than ZB, insert a new segment (M+1, YN)
      before (ZB, ZN).

    • Replace (YB, YN) by (YB, N).



Rewriting HTML Resources 

As explained above with reference to [...], when a reflector or
repeater serves a resource that itself contains resource identifiers (e.g., URLs)
of repeatable resources, [...]. When the browser requests
resources identified by the requested resource, it will go directly to the
repeater. Without this optimization, the browser would either make all requests at the
reflector (thereby increasing traffic at the origin server and necessitating far more
[...]) or make all requests at the repeater (thereby causing
the repeater to redundantly request and copy resources which could not be cached,
increasing the overhead of serving such resources).

Rewriting requires that a repeater has been selected (as described above with
reference to the Best Repeater Selector). Rewriting uses a so-called BASE directive.
[...] normally the address of the HTML resource.)

Rewriting is performed as follows:
Rewriting is performed as follows: 

F1. A BASE directive is added at the beginning of the HTML resource, or
modified where necessary. Normally, a browser interprets relative URLs
as being relative to the default base address, namely, the URL of the
HTML resource (page) in which they are encountered. The BASE
address added specifies the resource at the reflector which originally
served the resource. This means that unprocessed relative URLs (such as
those generated by Javascript™ programs) will be interpreted as relative
to the reflector. Without this BASE address, browsers would combine
relative addresses with repeater names to create URLs which were not in
the form required by repeaters (as described above in step D1).


F2. The rewriter identifies directives, such as embedded images and anchors, 
containing URLs. If the rewriter is running in a reflector, it must parse 
the HTML file to identify these directives. 

If it is running in a repeater, the rewriter may have access to pre-computed
information that identifies the location of each URL (placed in
the HTML file in step F4).

F3. For each URL encountered in the resource to be re-written, the rewriter 
must determine whether the URL is repeatable (as in steps B1-B2). If 
the URL is not repeatable, it is not modified. On the other hand, if the 
URL is repeatable, it is modified to refer to the selected repeater. 

F4. After all URLs have been identified and modified, if the resource is being 
served to a repeater, a table is appended at the beginning of the resource 
that identifies the location and content of each URL encountered in the 
resource. (This step is an optimization which eliminates the need for 
parsing HTML resources at the repeater.) 

F5. Once all changes have been identified, a new length is computed for the 
resource (page). The length is inserted in the HTTP header prior to 
serving the resource. 
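
Steps F1, F2, F3 and F5 can be illustrated with a deliberately simplified sketch. It handles only absolute http URLs in double-quoted src and href attributes, omits the table of step F4, and assumes a rewritten URL of the form http://<repeater>/<origin server>/<path>; that form, the regular expression and all names are assumptions, not the format defined in B3.

```python
import re

# Matches src="http://..." and href="http://..." (a simplification of F2).
URL_ATTR = re.compile(r'(?P<attr>\b(?:src|href))="(?P<url>http://[^"]+)"',
                      re.IGNORECASE)

def rewrite_html(html, page_url, repeater, is_repeatable):
    """Return (rewritten HTML, length in bytes for the HTTP header)."""
    def replace(match):
        url = match.group("url")
        if not is_repeatable(url):
            return match.group(0)                     # F3: leave unchanged
        # Assumed form: repeater as host, origin server name kept in the path.
        rest = url[len("http://"):]
        return '%s="http://%s/%s"' % (match.group("attr"), repeater, rest)
    body = URL_ATTR.sub(replace, html)                # F2 and F3
    # F1: the BASE directive is added last so that it is not itself rewritten.
    body = '<BASE HREF="%s">\n%s' % (page_url, body)
    return body, len(body.encode("utf-8"))            # F5
```

Relative URLs are left untouched and are resolved by the browser against the BASE address, that is, against the reflector.
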
An extension of HTML, known as XML, is currently being developed. The
process of rewriting URLs is similar for XML, with some differences in the
mechanism that parses the resource and identifies embedded URLs.

Handling Non-HTTP Protocols

This invention makes it possible to reflect references to resources that are served
by protocols other than HTTP, for instance, the File Transfer Protocol (FTP) and
audio/video stream protocols. However, many protocols do not provide the ability
[...] by rewriting URLs embedded in HTML pages. The following
modifications to the above algorithms are used to support this capability.

In F4, the rewriter rewrites URLs for servers if those servers appear in a
configurable table of cooperating origin servers, or so-called co-servers. The reflector
operator can define this table to include FTP servers and other servers. A rewritten
URL that refers to a non-HTTP resource takes the form:

[...]

where <scheme> is a supported protocol name such as "ftp". This URL format is an
alternative to the form shown in B3.

In C3, the repeater looks for a protocol embedded in the arriving request.
If a protocol is present and the requested resource is not already cached, [...]
serving it and storing it in the cache.



System Configuration and Management 

In addition to the processing described above, the repeater network requires
various mechanisms for system configuration and network management. Some of these
mechanisms are described here.

Reflectors allow their operators to synchronize repeater caches by performing
publishing operations. The process of keeping repeater caches synchronized is 
described below. Publishing indicates that a resource or collection of resources has 
changed. 

Repeaters and reflectors participate in various types of log processing. The 
results of logs collected at repeaters are collected and merged with logs collected at 
reflectors, as described below. 

Adding Subscribers to the Repeater Network 

When a new subscriber is added to the network, information about the 
subscriber is entered in a Subscriber Table at the master repeater and propagated to all 
repeaters in the network. This information includes the Committed Aggregate Information 
Rate (CAIR) for servers belonging to the subscriber, and a list of the repeaters that may 
be used by servers belonging to the subscriber. 

Adding Reflectors to the Repeater Network 

When a new reflector is added to the network, it simply connects to and 
announces itself to a contact repeater, preferably using a securely encrypted certificate 
including the repeater's subscriber identifier. 

The contact repeater determines whether the reflector network address is 
permitted for this subscriber. If it is, the contact repeater accepts the connection and 
updates the reflector with all necessary tables (using version numbers to determine 
which tables are out of date). 

The reflector processes requests during this time, but is not "enabled" (allowed 
to reflect requests) until all of its tables are current. 

Keeping Repeater Caches Synchronized

Repeater caches are coherent, in the sense that when a change to a resource is
[...] transaction.

Only the identifier of the changed resource (and not the entire resource) [...]
corresponding cached resource at the repeater. This process is far more efficient than
broadcasting the content of the changed resource to each repeater.

A repeater will load the newly modified resource the next time it is requested.

A resource change is identified at the reflector either manually by the operator,
or through a script when files are installed on the server, or automatically through a
change detection mechanism (e.g., a separate process that checks regularly for changes).

A resource change causes the reflector to send an "invalidate" message to its
contact repeater, which forwards the message to the master repeater.
The message contains a list of resource identifiers (or regular expressions
[...] of resource identifiers) that have changed. (Regular expressions are used to
invalidate a directory or an entire server.) The repeater network uses a two-phase
commit process to ensure that all repeaters correctly invalidate a given resource.

The invalidation process operates as follows:

The master broadcasts a "phase 1" invalidation request to all repeaters indicating
the resources and regular expressions describing sets of resources to be invalidated.

When each repeater receives the phase 1 message, it first places the resource
identifiers or regular expressions into a list of resource identifiers pending invalidation.
Any resource requested (in C3) that is in the pending invalidation list may not be
served from the cache. This prevents the cache from requesting the resource from a
peer cache which may not have received an invalidation notice. Were it to request a
resource in this manner, it might replace the newly invalidated resource by the same,
now stale, data.

The repeater then compares the resource identifier of each resource in its cache
against the resource identifiers and regular expressions in the list. 

Each match is invalidated by marking it stale and optionally removing it from the 
cache. This means that a future request for the resource will cause it to retrieve a new 
copy of the resource from the reflector. 

When the repeater has completed the invalidation, it returns an acknowledgment 
to the master. The master waits until all repeaters have acknowledged the invalidation 
request. 

If a repeater fails to acknowledge within a given period, it is disconnected from 
the master repeater. When it reconnects, it will be told to flush its entire cache, which 
will eliminate any consistency problem. (To avoid flushing the entire cache, the master 
could keep a log of all invalidations performed, sorted by date, and flush only files 
invalidated since the last time the reconnecting repeater successfully completed an 
invalidation. In the presently preferred embodiments this is not done since it is believed 
that repeaters will seldom disconnect.) 

When all repeaters have acknowledged invalidation (or timed out), the master
repeater broadcasts a "phase 2" invalidation request to all repeaters. This causes the
repeaters to
remove the corresponding resource identifiers and regular expressions from the list of 
resource identifiers pending invalidation. 

In another embodiment, the invalidation request will be extended to allow a 
"server push". In such requests, after phase 2 of the invalidation process has completed, 
the repeater receiving the invalidation request will immediately request a new copy of the 
invalidated resource to place in its cache. 
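
The two-phase invalidation can be sketched in a few lines. The sketch treats every entry as a regular expression (a literal identifier would first be escaped), invalidates by removal, and omits timeouts, disconnection and server push; the class and method names are illustrative only.

```python
import re

class RepeaterCache:
    def __init__(self):
        self.entries = {}   # resource identifier -> cached content
        self.pending = []   # compiled expressions pending invalidation

    def phase1(self, patterns):
        """Record the expressions as pending, invalidate matches, acknowledge."""
        compiled = [re.compile(p) for p in patterns]
        self.pending.extend(compiled)
        for url in list(self.entries):
            if any(p.fullmatch(url) for p in compiled):
                del self.entries[url]
        return "ack"

    def phase2(self, patterns):
        """Remove the expressions from the pending list."""
        done = set(patterns)
        self.pending = [p for p in self.pending if p.pattern not in done]

    def may_use_peer(self, url):
        """A resource pending invalidation must not be fetched from a peer."""
        return not any(p.fullmatch(url) for p in self.pending)

def invalidate(repeaters, patterns):
    """Master side: phase 1 to all repeaters, then phase 2 once all acknowledge."""
    acks = [r.phase1(patterns) for r in repeaters]
    if all(a == "ack" for a in acks):
        for r in repeaters:
            r.phase2(patterns)
```

Keeping the expressions pending between the two phases is what prevents a repeater from refilling its cache with stale data from a peer that has not yet processed the request.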



Logs and Log Processing

Web server activity logs are fundamental to monitoring the activity in a Web site.
This invention creates "merged logs" that combine the activity at reflectors with the
[...] Web resource requests made on behalf of [...].

This merged log can be processed [...]
the activity of all repeaters [...]
configured by the reflector operator, a reflector [...]

[...] network, if the reflector is configured to do so by [...]

1. request was served by a reflector locally;

2. request was reflected to a repeater†;

3. request was served by a reflector to a repeater†;

4. request for non-rep[...];

5. request was served by a repeater from the cache;

6. request was served by a repeater after filling cache;

7. request pending invalidation was served by a repeater.

([...] normally appear in a final activity log.)

In addition, activity logs contain a duration, and extended precision timestamps.
[...] quite useful information. The extended precision timestamp makes it possible to
accurately merge activity logs.

Repeaters use the Network Time Protocol (NTP) to maintain synchronized
clocks. Reflectors may either use NTP or calculate a time bias to provide roughly
accurate timestamps relative to their contact repeater.
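
Merging logs that are each already sorted by timestamp can be sketched with a standard k-way merge. The record layout (timestamp, source, entry) and the bias correction helper are assumptions made for illustration.

```python
import heapq

def apply_bias(log, bias):
    """Shift a reflector's timestamps by its calculated time bias (seconds)."""
    return [(t + bias, source, entry) for t, source, entry in log]

def merge_logs(*logs):
    """Merge logs that are each sorted by timestamp into one sorted log."""
    return list(heapq.merge(*logs, key=lambda record: record[0]))

# Hypothetical records with sub-second timestamps.
reflector_log = [(1.000, "reflector", "GET /"), (3.250, "reflector", "GET /b")]
repeater_log = [(2.125, "wr1", "GET /a.gif")]
merged = merge_logs(apply_bias(reflector_log, 0.0), repeater_log)
```

The finer the timestamp precision, the less often two records from different machines compare equal and have to be ordered arbitrarily.
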
Enforcing Committed Aggregate Information Rate

The repeater network monitors and limits the aggregate rate at which data is
served on behalf of a given subscriber by all repeaters. This mechanism provides the
following benefits:

1. provides a means of pricing repeater service; 

2. provides a means for estimating and reserving capacity at repeaters; 

3. provides a means for preventing clients of a busy site from limiting access to 
other sites. 

For each subscriber, a "threshold aggregate information rate" (TAIR) is
configured and maintained at the master repeater. This threshold is not necessarily the
committed rate; it may be a multiple of the committed rate, based on a pricing policy.

Each repeater measures the information rate component of each reflector for 
which it serves resources, periodically (typically about once a minute), by recording the 
number of bytes transmitted on behalf of that reflector each time a request is delivered. 
The table thus created is sent to the master repeater once per period. The master
repeater combines the tables from each repeater, summing the measured information of 
each reflector over all repeaters that serve resources for that reflector, to determine the 
"measured aggregate information rate" (MAIR) for each reflector. 

If the MAIR for a given reflector is greater than the TAIR for that reflector, the 
MAIR is transmitted by the master to all repeaters and to the respective reflector. 
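
The aggregation performed by the master repeater might look like the following sketch. The per-repeater tables are assumed to map a reflector identifier to the bytes served during one period; the names and sample figures are hypothetical.

```python
from collections import Counter

def measured_aggregate_rates(per_repeater_tables):
    """Sum each reflector's measured rate over all reporting repeaters (MAIR)."""
    mair = Counter()
    for table in per_repeater_tables:
        mair.update(table)   # adds the counts reflector by reflector
    return dict(mair)

def over_threshold(mair, tair):
    """Reflectors whose MAIR exceeds their TAIR; these values are broadcast."""
    return {r: rate for r, rate in mair.items() if rate > tair[r]}

tables = [{"reflector-a": 100, "reflector-b": 10}, {"reflector-a": 50}]
mair = measured_aggregate_rates(tables)
exceeded = over_threshold(mair, {"reflector-a": 120, "reflector-b": 100})
```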

When a reflector receives a request, it determines whether its most recently
calculated MAIR is greater than its TAIR. If this is the case, the reflector




r u j;<Wnce between the MA1R and the CA1R. 

In the current system, this string has the form 
"src=overload". 

The reflector tests for this string in B2. 

The mechanism for limiting Aggregate Information Rate described above is 
[...] at the level of sessions with clients (since once a client has been 
reflected to a given repeater, the rewriting [...]). A more fine-grained 
mechanism for enforcing TAIR limits within repeaters operates by reducing the 
[...] competing for [...]. 

[...] subscriber in a data field associated with the channel. Each time a write operation is 
about to be made to the channel, the Metered Output Stream first inspects the current 
values of the MAIR and TAIR, calculated above, for the given subscriber. If the MAIR 
is larger than the TAIR, then the mechanism pauses briefly before performing the write 
operation. The length of the pause is proportional to the amount the MAIR exceeds the 
TAIR. The pause ensures that tasks sending other resources to other clients, perhaps on 
behalf of other subscribers, will have an opportunity to send their data. 
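
The pause-before-write behavior can be illustrated with a small wrapper class. This is a sketch under stated assumptions: the class layout, the proportionality constant `seconds_per_unit_excess`, the `max_pause` cap, and the injectable `sleep` function are hypothetical and are not taken from the text, which specifies only that the pause is proportional to the amount by which the MAIR exceeds the TAIR.

```python
import io
import time


class MeteredOutputStream:
    """Wraps an output channel and delays writes for over-threshold subscribers."""

    def __init__(self, channel, subscriber, rates,
                 seconds_per_unit_excess=1e-6, max_pause=0.5, sleep=time.sleep):
        self.channel = channel
        self.subscriber = subscriber      # recorded when the channel is established
        self.rates = rates                # subscriber -> (MAIR, TAIR), bytes/second
        self.seconds_per_unit_excess = seconds_per_unit_excess  # assumed constant
        self.max_pause = max_pause        # assumed safety cap, not in the source
        self.sleep = sleep

    def pause_length(self):
        mair, tair = self.rates[self.subscriber]
        excess = mair - tair
        if excess <= 0:
            return 0.0
        # The pause grows linearly with the excess, up to the assumed cap.
        return min(self.max_pause, excess * self.seconds_per_unit_excess)

    def write(self, data):
        pause = self.pause_length()
        if pause > 0:
            self.sleep(pause)             # gives other tasks a chance to send
        return self.channel.write(data)


pauses = []                               # records pauses instead of sleeping
rates = {
    "busy-subscriber": (150_000.0, 100_000.0),
    "quiet-subscriber": (10_000.0, 50_000.0),
}
busy = MeteredOutputStream(io.BytesIO(), "busy-subscriber", rates, sleep=pauses.append)
quiet = MeteredOutputStream(io.BytesIO(), "quiet-subscriber", rates, sleep=pauses.append)
busy.write(b"hello")
quiet.write(b"hello")
```

Here only the write for the busy subscriber is delayed; the quiet subscriber's write proceeds immediately.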

Repeater Network Resilience 

The repeater network is capable of recovering when a repeater or network 
connection fails. 

A repeater cannot operate unless it is connected to the master repeater. The 
master repeater exchanges critical information with other repeaters, including 
information about repeater load, aggregate information rate, subscribers, and link cost. 

If a master fails, a "succession" process ensures that another repeater will take 
over the role of master, and the network as a whole will remain operational. If a master 
fails, or a connection to a master fails through a network problem, any repeater 
attempting to communicate with the master will detect the failure, either through an 
indication from TCP/IP, or by timing out from a regular "heartbeat" message it sends to 
the master. 

When any repeater is disconnected from its master, it immediately tries to 
reconnect to a series of potential masters based on a configurable file called its 
"succession list". 

The repeater tries each system on the list in succession until it successfully 
connects to a master. If in this process, it comes to its own name, it takes on the role of 
master, and accepts connections from other repeaters. If a repeater which is not at the 
top of the list becomes the master, it is called the "temporary master". 
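
The walk through the succession list can be sketched as below. The function names and the `try_connect` callback are hypothetical; the sketch assumes each repeater knows its own name and that a connection attempt simply succeeds or fails.

```python
def find_master(succession_list, own_name, try_connect):
    """Try each entry in order; return (master_name, i_am_master)."""
    for candidate in succession_list:
        if candidate == own_name:
            # Reached our own name without finding a master: take on the role.
            return own_name, True
        if try_connect(candidate):
            return candidate, False
    return None, False


def would_be_temporary_master(succession_list, own_name):
    """A master that is not first on the list is only a temporary master."""
    return bool(succession_list) and succession_list[0] != own_name


succession = ["repeater-1", "repeater-2", "repeater-3"]
reachable = {"repeater-2", "repeater-3"}      # repeater-1 has failed

result_r2 = find_master(succession, "repeater-2", lambda name: name in reachable)
result_r3 = find_master(succession, "repeater-3", lambda name: name in reachable)
```

In this example `repeater-2` becomes a temporary master and `repeater-3` connects to it.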

A network partition may cause two groups of repeaters each to elect a master. 
When the partition is corrected, it is necessary that the more senior master take over the 
network. Therefore, when a repeater is temporary master, it regularly tries to reconnect 
to any master above it in the succession list. If it succeeds, it immediately disconnects 
from all of the repeaters connected to it. When they retry their succession lists, they will 
connect to the more senior master repeater. 

To prevent losses of data, a temporary master does not accept configuration 
changes and does not process log files. In order to take on these tasks, it must be 
informed that it is primary master by manual modification of its succession list. Each 
[...] of who the master is. 

If a repeater is disconnected from the master, it must resynchronize its cache 
when it reconnects to the master. The master can maintain a list of recent cache 
invalidations and send to the repeater any invalidations it was not able to process while 
disconnected. If this list is not available for some reason (for instance, because the 
reflector has been disconnected too long), the reflector must invalidate its entire cache. 
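
One way to picture this resynchronization is a bounded log of invalidations kept at the master. The sketch below is illustrative only: the sequence numbers, the fixed log capacity, and all names are assumptions, since the text does not say how the list of recent invalidations is indexed.

```python
from collections import deque


class InvalidationLog:
    """Bounded list of recent cache invalidations, kept by the master."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)   # (sequence number, resource)
        self.next_seq = 1

    def record(self, resource):
        self.entries.append((self.next_seq, resource))
        self.next_seq += 1

    def missed_since(self, last_seen):
        """Resources invalidated after last_seen, or None if the log is too short."""
        if last_seen + 1 >= self.next_seq:
            return []                            # nothing was missed
        if not self.entries or self.entries[0][0] > last_seen + 1:
            return None                          # disconnected too long
        return [res for seq, res in self.entries if seq > last_seen]


def resynchronize(cache, log, last_seen):
    """Apply missed invalidations, or drop the whole cache if they are unavailable."""
    missed = log.missed_since(last_seen)
    if missed is None:
        cache.clear()
    else:
        for resource in missed:
            cache.pop(resource, None)
    return log.next_seq - 1                      # new last-seen sequence number


log = InvalidationLog(capacity=3)
for name in ["a", "b", "c", "d", "e"]:
    log.record(name)

short_gap = {"a": 1, "d": 2, "x": 3}
long_gap = {"a": 1, "x": 3}
seen_short = resynchronize(short_gap, log, last_seen=3)   # missed d and e
seen_long = resynchronize(long_gap, log, last_seen=1)     # log no longer covers the gap
```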

A reflector is not permitted to reflect requests unless it is connected to a 
repeater. The reflector relies on its contact repeater for critical information, such as Load 
and Link Cost Tables, and current aggregate information rate. A reflector that is not 
connected to a repeater can continue to receive requests and handle them locally. 

If a reflector loses its connection with a repeater, due to a repeater failure or 
network outage, it continues to operate while it tries to connect to a repeater. 

Each time a reflector attempts to connect to a repeater, it uses DNS to identify a 
set of candidate repeaters given a domain name that represents the repeater network. 
The reflector tries each repeater in this set until it makes a successful contact. Until a 
[...] 
connects to a repeater, the repeater can tell it to attempt to contact a different repeater; 
this allows the repeater network to ensure that no individual repeater has too many 
contacts. 

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When contact is made, the reflector provides the version number of each of its 
tables to its contact repeater. The repeater then decides which tables should be updated 
and sends appropriate updates to the reflector. Once all tables have been updated, the 
repeater notifies the reflector that it may now start reflecting requests. 
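
The version comparison made by the contact repeater might look like the following sketch. It assumes that table versions are integers that only increase and that tables are keyed by name; neither detail is stated in the text, and the function name is hypothetical.

```python
def select_table_updates(reflector_versions, repeater_tables):
    """Return the tables the reflector is missing or holds in an older version."""
    updates = {}
    for name, (version, contents) in repeater_tables.items():
        # -1 stands for "the reflector has no copy of this table".
        if reflector_versions.get(name, -1) < version:
            updates[name] = (version, contents)
    return updates


reflector_versions = {"load": 7, "link_cost": 3}
repeater_tables = {
    "load": (9, {"repeater-1": 0.4}),
    "link_cost": (3, {"group-1": 10}),
    "rules": (1, ["pattern"]),
}
updates = select_table_updates(reflector_versions, repeater_tables)
```

Once the reflector has applied every entry in `updates`, the repeater would send its notification that reflecting may begin.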

Using a Proxy Cache within a Repeater 

Repeaters are intentionally designed so that any proxy cache can be used as a 
component within them. This is possible because the repeater receives HTTP requests 
and converts them to a form recognized by the proxy cache. 

On the other hand, several modifications to a standard proxy cache have been or 
may be made as optimizations. These include, in particular, the ability to conveniently 
invalidate a resource, the ability to support cache quotas, and the ability to avoid making 
an extra copy of each resource as it passes from the proxy cache through the repeater to 
the requester. 

In a preferred embodiment, a proxy cache is used to implement C3. The proxy 
cache is dedicated for use only by one or more repeaters. Each repeater requiring a 
resource from the proxy cache constructs a proxy request from the inbound resource 
request. A normal HTTP GET request to a server contains only the pathname part of 
the URL; the scheme and server name are implicit. (In an HTTP GET request to a 
repeater, the pathname part of the URL includes the name of the origin server on behalf 
of which the request is being made, as described above.) However, a proxy agent GET 
request takes an entire URL. Therefore, the repeater must construct a proxy request 
containing the entire URL from the path portion of the URL it receives. Specifically, if 
the incoming request takes the form: 

GET /<origin server>/<path> 

then the repeater constructs a proxy request of the form: 

GET http://<origin server>/<path> 

and if the incoming request takes the form: 

GET /<origin server>@proxy=<scheme>:<type>@/<path> 

then the repeater constructs a proxy request of the form: 

GET <scheme>://<origin server>/<path> 

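
The construction of the proxy request can be sketched as below. The sketch assumes the two incoming forms are `GET /<origin server>/<path>` and `GET /<origin server>@proxy=<scheme>:<type>@/<path>`; the `@proxy=` marker and the `<type>` field are a reading of damaged text and should be treated as assumptions, as should the function name and the regular expressions.

```python
import re

# Assumed incoming request targets, as received by the repeater:
#   /<origin server>/<path>
#   /<origin server>@proxy=<scheme>:<type>@/<path>
_PROXY_FORM = re.compile(
    r"^/(?P<server>[^/@]+)@proxy=(?P<scheme>[^:@]+)(?::(?P<type>[^@]*))?@/(?P<path>.*)$"
)
_PLAIN_FORM = re.compile(r"^/(?P<server>[^/@]+)/(?P<path>.*)$")


def to_proxy_request(request_line):
    """Turn a repeater request line into a proxy request carrying a full URL."""
    method, target = request_line.split(" ", 1)
    match = _PROXY_FORM.match(target)
    if match:
        scheme = match.group("scheme")
    else:
        match = _PLAIN_FORM.match(target)
        if match is None:
            raise ValueError(f"unrecognized repeater request: {request_line!r}")
        scheme = "http"          # the scheme is implicit in the plain form
    return f"{method} {scheme}://{match.group('server')}/{match.group('path')}"


plain = to_proxy_request("GET /www.example.com/images/logo.gif")
explicit = to_proxy_request("GET /ftp.example.com@proxy=ftp:server@/pub/file.txt")
```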
Cache Control 

HTTP replies contain directives called cache control directives, which are used 
to indicate to a cache whether the [...] may be cached and [...] 
must [...]. [...] This is known in the industry as "cache-busting". [...] 
cache-busting to be impolite, advertisers who rely on this information may 
consider it imperative. 

When a resource is stored in a repeater, its cache [...] be ignored by the 
repeater, because the repeater [...] is stale. When a proxy cache is used as the cache 
at the repeater, [...] the resource is served from the cache to a client, in order to 
permit the cache-control policy (including any cache-busting) to take effect. 
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The present invention contains mechanisms to prevent the proxy cache within a 
repeater from honoring cache control directives, while permitting the directives to be 
served from the repeater. 

When a reflector serves a resource to a repeater in B4, it replaces all cache 
directives by modified directives that are ignored by the repeater proxy cache. It does 
this by prefixing a distinctive string such as "wr-" to the beginning of the HTTP tag. 
Thus, "expires" becomes "wr-expires", and "cache-control" becomes 
"wr-cache-control". This prevents the proxy cache itself from honoring the directives. 
When a repeater serves a resource in C4, and the requesting client is not another 
repeater, it searches for HTTP tags beginning with "wr-" and removes the "wr-". This 
allows the clients retrieving the resource to honor the directives. 
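
The prefixing and un-prefixing of cache directives can be sketched as a pair of header transformations. The set of directive names below contains only the two examples given in the text; treating headers as a simple dictionary, and the function names, are assumptions made for illustration.

```python
CACHE_DIRECTIVES = {"expires", "cache-control"}   # examples named in the text
PREFIX = "wr-"


def disable_directives(headers):
    """Reflector side (B4): hide cache directives from the repeater's proxy cache."""
    return {
        (PREFIX + name if name.lower() in CACHE_DIRECTIVES else name): value
        for name, value in headers.items()
    }


def restore_directives(headers, client_is_repeater):
    """Repeater side (C4): restore the directives unless serving another repeater."""
    if client_is_repeater:
        return dict(headers)
    return {
        (name[len(PREFIX):] if name.lower().startswith(PREFIX) else name): value
        for name, value in headers.items()
    }


original = {"expires": "0", "cache-control": "no-cache", "content-type": "text/html"}
stored = disable_directives(original)
to_client = restore_directives(stored, client_is_repeater=False)
to_peer = restore_directives(stored, client_is_repeater=True)
```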



Resource Revalidation 

There are several cases where a resource may be cached so long as the origin 
server is consulted each time it is served. In one case, the request for the resource is 
attached to a so-called "cookie". The origin server must be presented with the cookie to 
record the request and determine whether the cached resource may be served or not. In 
another case, the request for the resource is attached to an authentication header (which 
identifies the requester with a user id and password). Each new request for the resource 
must be tested at the origin server to assure that the requester is authorized to access the 
resource. 

The HTTP 1.1 specification defines a reply header titled "Must-Revalidate" 
which allows an origin server to instruct a proxy cache to "revalidate" a resource each 
time a request is received. Normally, this mechanism is used to determine whether a 
resource is still fresh. In the present invention, Must-Revalidate makes it possible to ask 
an origin server to validate a request that is otherwise served from a repeater. 

The reflector rule base contains information that determines which resources 
may be repeated but must be revalidated each time they are served. For each such 
resource, in B4, the reflector attaches a Must-Revalidate header. Each time a request 
comes to a repeater for a cached resource marked with a Must-Revalidate header, the 
request is forwarded to the reflector for validation prior to serving the requested 
resource. 



Cache Quotas 

The cache component of a repeater is shared among those subscribers that 
reflect clients to that repeater. In order to allow subscribers fair access to storage 
facilities, the cache may be extended to support quotas. 

Normally, a proxy cache may be configured with a disk space threshold T. 
Whenever more than T bytes are stored in the cache, the cache attempts to find 
resources to eliminate. 

Typically a cache uses the least-recently-used (LRU) algorithm to determine 
which resources to eliminate; more sophisticated caches use other algorithms. A cache 
may support more than one threshold: for instance, a lower threshold which, when 
reached, causes a low priority background process to remove items from the cache, and 
a higher threshold which, when reached, prevents resources from being cached until 
sufficient free disk space has been reclaimed. 

If two subscribers A and B share a cache, and more resources of subscriber A 
[...] 
behavior of A, B's resources will never be cached when they are requested. In the 
present invention, this behavior is undesirable. To address this issue, the invention 
extends the cache at a repeater to support cache quotas. 

The cache records the amount of space used by each subscriber in D_S, and 
supports a configurable threshold T_S for each subscriber. 

Whenever a resource is added to the cache (at C3), the value D_S is updated for 
the subscriber providing the resource. If D_S is larger than T_S, the cache attempts to find 
resources to eliminate, from among those resources associated with subscriber S. The 
cache is effectively partitioned into separate areas for each subscriber. 

The original threshold T is still supported. If the sum of reserved segments for 
each subscriber is smaller than the total space reserved in the cache, the remaining area 
is "common" and subject to competition among subscribers. 
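
A per-subscriber quota can be sketched with one LRU list per subscriber. This is a simplified illustration, not the described cache: it omits the global threshold T and the common area, uses LRU as the eviction order by assumption, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class QuotaCache:
    """Tracks space used per subscriber (D_S) against a per-subscriber quota (T_S)."""

    def __init__(self, quotas):
        self.quotas = dict(quotas)                          # T_S, in bytes
        self.used = {s: 0 for s in quotas}                  # D_S, in bytes
        self.entries = {s: OrderedDict() for s in quotas}   # url -> size, oldest first

    def touch(self, subscriber, url):
        """Mark a resource as recently used."""
        self.entries[subscriber].move_to_end(url)

    def add(self, subscriber, url, size):
        """Store a resource and evict this subscriber's oldest entries if over quota."""
        entries = self.entries[subscriber]
        if url in entries:
            self.used[subscriber] -= entries.pop(url)
        entries[url] = size
        self.used[subscriber] += size
        evicted = []
        # Only resources of the same subscriber are candidates for removal.
        while self.used[subscriber] > self.quotas[subscriber] and len(entries) > 1:
            old_url, old_size = entries.popitem(last=False)
            self.used[subscriber] -= old_size
            evicted.append(old_url)
        return evicted


cache = QuotaCache({"A": 100, "B": 100})
cache.add("A", "a1", 60)
cache.add("B", "b1", 80)
evicted = cache.add("A", "a2", 60)    # A exceeds its quota; only A's data is evicted
```

However busy subscriber A becomes, subscriber B's cached resource is left in place.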

Note that this mechanism might be implemented by modifying the existing proxy 
cache discussed above, or it might be implemented without modifying the proxy 
cache if the proxy cache at least makes it possible for an external program to obtain a 
list of resources in the cache, and to remove a given resource from the cache. 



Rewriting from Repeaters 

When a repeater receives a request for a resource, its proxy cache may be 
configured to determine whether a peer cache contains the requested resource. If so, 
the proxy cache obtains the resource from the peer cache, which can be faster than 
obtaining it from the origin server (the reflector). However, a consequence of this is that 
rewritten HTML resources retrieved from the peer cache would identify the wrong 
repeater. Thus, to allow for cooperating proxy caches, resources are preferably rewritten 
at the repeater. 

When a resource is rewritten for a repeater, a special tag is placed at the 
beginning of the resource. When constructing a reply, the repeater inspects the tag to 
determine whether the resource indicates that additional rewriting is necessary. If so, the 
repeater modifies the resource by replacing references to the old repeater with references 
to the new repeater. 

It is only necessary to perform this rewriting when a resource is served to the 
proxy cache at another repeater. 
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Repeater-Side Include 

Sometimes, an origin server constructs a custom resource [...]. 
In such a case, that resource must be served locally [...]. 
[...] references to 
[...] by the Web browser [...] by HTML. However, 
[...] to a separate resource. Therefore, custom resources often necessarily 
contain large amounts of static text that would otherwise be repeatable. 

To resolve this potential inefficiency, repeaters recognize a special directive 
called a "repeater side include". This directive makes it possible for the repeater to 
[...] static text can be made repeatable, and only the special directive need be 
served locally by the reflector. 

For example, a resource X might consist of custom directives selecting an 
advertising banner, followed by a [...] article. To make this resource repeatable, the 
Web site administrator must break out a second resource, Y, to select the banner. 
[...] along with the article. Resource Y is created and contains only the custom directives 
selecting an ad banner. Now resource X is repeatable, and only resource Y, which is 
relatively small, is not repeatable. 

When a repeater constructs a reply, it determines whether the resource being 
served is an HTML resource, and if so, scans it for repeater-side include directives. 
Each such directive includes a URL, which the repeater resolves and substitutes in place 
of the directive. The entire resource must be assembled before it is served, in order to 
determine its final size, as the size is included in a reply header ahead of the resource. 
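
The scan-and-substitute step can be sketched as follows. The text does not give the syntax of a repeater-side include directive, so the comment-style form `<!--#repeater-include url="..." -->` used here is purely hypothetical, as are the function name and the `fetch` callback that stands in for resolving the directive's URL.

```python
import re

# Hypothetical directive syntax; the actual form is not given in the text.
INCLUDE = re.compile(r'<!--#repeater-include url="([^"]+)"\s*-->')


def assemble(html, fetch):
    """Replace each include directive with the resource it names, then size the reply."""
    body = INCLUDE.sub(lambda m: fetch(m.group(1)), html)
    # The whole body must exist before the size header can be written.
    headers = {"Content-Length": str(len(body.encode("utf-8")))}
    return headers, body


# Resource Y (not repeatable) is small; resource X (repeatable) refers to it.
resources = {"/ads/banner": "<img src='ad.gif'>"}
resource_x = 'A<!--#repeater-include url="/ads/banner" -->B'
headers, body = assemble(resource_x, resources.__getitem__)
```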
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Thus, a method and apparatus for dynamically replicating selected resources in 
computer networks is provided. One skilled in the art will appreciate that the present 
invention can be practiced by other than the described embodiments, which are 
presented for purposes of illustration and not limitation, and the present invention is 
limited only by the claims that follow. 
What is claimed: 

1. A method of processing resource requests in a computer network, the 
method comprising, 
(i) by a client: 

(A) making a request for a particular resource from an origin server, 
the request including a resource identifier for the particular 
resource; 

(ii) by a reflector: 

(B) intercepting the request from the client to the origin server, 
(C) selecting a repeater to process the request; 

(D) providing to the client a modified resource identifier designating 
the repeater, 

(iii) by the client: 

(E) receiving the modified resource identifier from the reflector, and 

(F) making a request for the particular resource from the repeater 
designated in the modified resource identifier, 

(iv) by the repeater: 

(G) receiving the request from the client; and 

(H) returning the requested resource to the client. 

2. A method as in claim 1 further comprising, by the repeater: 

(I) making a request for the resource from the origin server; and 
(J) receiving the resource from the origin server. 

3. A method as in claim 1 wherein the selecting of a repeater by the 
reflector comprises: 

(C1) partitioning the network into groups; 
(C2) determining which group the client is in; 
(C3) selecting, from a plurality of repeaters in the network, a set of repeaters 
having a lowest cost relative to the group which the client is in; and 
(C4) selecting as the repeater a member of the selected set of repeaters. 

4. A method as in claim 3, wherein the cost of a repeater is a value based on 
that repeater's current load and a maximum load for that repeater. 

5. A method as in claim 3, wherein the cost of a repeater is a value based on 
a predicted cost or speed of transmission between the repeater and a client in the group. 

6. A method as in claim 1 wherein the particular resource itself contains at 
least one other resource identifier of at least one other resource, the method further 

comprising: 

rewriting the particular resource to replace at least some of the resource 
identifiers contained therein with modified resource identifiers designating a repeater 
instead of the origin server. 

7. A method as in claim 6 wherein the rewriting is performed by one of the 
repeater, the reflector or another repeater. 

8. A method of processing resource requests in a computer network, the 
method comprising, 
(i) by a client: 

(A) making a request for a particular resource from an origin server, 
the request including a resource identifier for the particular 
resource; 

(ii) by a reflector: 

(B) intercepting the request from the client to the origin server, 
(C) determining whether to reflect the request to a repeater, 
(D) when the reflector determines not to reflect the request, 
forwarding the request to the origin server, otherwise 
(D1) selecting a repeater to process the request; 
(D2) providing to the client a modified resource identifier 
designating the repeater. 

9. A method as in claim 8, further comprising, when the reflector 
determines to reflect the request, 
(iii) by the client: 

(E) receiving the modified resource identifier from the reflector, and 

(F) making a request for the particular resource from the repeater 
designated in the modified resource identifier; 

(iv) by the repeater: 

(G) receiving the request from the client; and 

(H) returning the requested resource to the client. 

10. A method as in claim 8 wherein the reflector determines whether to 
reflect a request by comparing the resource identifier with regular expression patterns of 
repeatable resources. 

11. A method as in claim 8, wherein the reflector has a threshold aggregate 
information rate (TAIR) associated therewith, and wherein the determining of whether 
to reflect the request to a repeater comprises: 

determining whether the TAIR of the reflector is exceeded by a measured 
aggregate information rate (MAIR) for the reflector, wherein the reflector determines 
[...] 

12. A method as in claim 8, wherein the reflector has a threshold aggregate 
information rate (TAIR) associated therewith, and wherein the determining of whether 
to reflect the request to a repeater comprises: 

probabilistically determining whether the TAIR of the reflector is exceeded by a 
measured aggregate information rate (MAIR) for the reflector, wherein the reflector 
determines not to reflect the request as an exponential function of the difference 
between the MAIR and the TAIR. 

13. A method as in any of claims 11-12, wherein the MAIR is obtained from 
repeaters according to the rate at which they have transmitted data on behalf of the 
reflector during a given time interval. 

14. A method as in any one of claims 1-12 wherein the network is the 
Internet and wherein the resource identifier is a uniform resource locator (URL) for 
designating resources on the Internet, and wherein the modified resource identifier is a 
URL designating the repeater and indicating the reflector or the server, and wherein 
the modified resource identifier is provided to the client using a REDIRECT message. 

15. In a computer network wherein clients request resources from origin 

servers, a method comprising: 

providing at least one repeater, 
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providing reflectors at some of the origin servers, each reflector intercepting 
client resource requests made to its respective origin server; and 

each reflector selectively redirecting client resource requests for certain resources 
to one of the repeaters. 

16. A method as in claim 15 further comprising, by repeaters in the network: 
servicing redirected client resource requests; and 
selectively maintaining copies of requested resources, 

whereby resources corresponding to redirected resource requests are selectively 
migrated from their origin servers to one or more repeaters. 

17. A computer network comprising: 

a plurality of origin servers, at least some of the origin servers having reflectors 
associated therewith; 

a plurality of repeaters; and 
a plurality of clients, 

wherein each reflector is adapted to intercept resource requests made to its 
respective origin server and to selectively redirect the resource requests to a dynamically 
selected repeater. 

18. In a computer network wherein clients request resources from origin 
servers, a reflector mechanism associated with an origin server, the reflector mechanism 
comprising: 

means for intercepting a resource request made by a client of an origin server, 
means for analyzing the resource request to determine whether to service the 
request locally at the origin server; 

means for determining a best repeater in the network to service the request when 
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the analyzing means determines that the request should not be serviced locally; and 
means for redirecting the client to the best repeater. 

19. A reflector mechanism as in claim 18 wherein the network is partitioned 
into groups and the means for determining the best repeater comprises: 
means for determining which group the client is in; 
means for selecting, from a plurality of repeaters in the network, a set of 
repeaters having a lowest cost relative to the group the client is in; and 

means for selecting as the best repeater a member of the set of repeaters. 

20. A reflector mechanism as in claim 19, wherein the cost of a repeater is a 
value based on a predicted cost or speed of transmission between the repeater and a 
client in the group. 

21. A mechanism as in claim 19, wherein the cost of a repeater is a value 
based on that repeater's current load and a maximum load for that repeater. 

22. A reflector as in claim 16 wherein the resource itself contains resource 
identifiers, the reflector further comprising: 

means for rewriting the resource to replace at least some of the resource 
identifiers contained therein with modified resource identifiers designating the repeater 
instead of the origin server. 

23. In a computer network wherein clients request resources from origin 
servers, a repeater mechanism comprising: 

means for receiving a resource request from a client; 

means for determining whether the resource is available locally; 

means for, when it is determined that the resource is not available locally, 
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obtaining the resource from an origin server, and 

means for providing the resource to the client. 

24. A reflector as in claim 18 wherein the resource itself contains resource 
identifiers, the repeater further comprising: 

means for rewriting the resource to replace at least some of the resource 
identifiers contained therein with modified resource identifiers designating the repeater 
instead of the origin server. 
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